CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part of
application Serial No. 09/062,142, filed April 17, 1998,
which application is pending, which claims the benefit of
provisional application No. 60/[illegible],185, filed April 24,
1997.

BACKGROUND OF THE INVENTION 

Enzymes are used within a wide range of 
applications in industry, research, and medicine. Through 
the use of enzymes, industrial processes can be carried 
out at reduced temperatures and pressures and with less 
dependence on the use of corrosive or toxic substances. 
The use of enzymes can thus reduce production costs, 
energy consumption, and pollution as compared to non- 
enzymatic products and processes. 

An important group of enzymes is the proteases,
which cleave proteins. Industrial applications of
proteases include food processing, brewing, and alcohol
production. Proteases are important components of laundry
detergents and other products. Within biological
research, proteases are used in purification processes to
degrade unwanted proteins. It is often desirable to
employ proteases of low specificity or mixtures of more
specific proteases to obtain the necessary degree of
degradation.

Proteases are also key components of a broad
range of biological pathways, including blood coagulation
and digestion. For example, the absence or insufficiency
of a protease can result in a pathological condition that
can be treated by replacement or augmentation therapy.
Such therapies include the treatment of hemophilia with
clotting factors VIII, IX, and VIIa. In another
application, the proteolytic enzyme tissue plasminogen
activator (t-PA) is used to activate the body's clot
lysing mechanism, thereby reducing morbidity resulting
from myocardial infarction. The protease thrombin is used
to initiate the clotting of fibrinogen-based tissue
adhesives during surgery. Neutrophils produce several
antibacterial serine proteases (Gabay, Ciba Found. Symp.
186:237-247, 1994; Scocchi et al., Eur. J. Biochem.
209:589-595, 1992). Proteases also regulate cellular
processes through receptor-mediated pathways by
proteolytic activation of the cognate receptor (Vu et al.,
Cell 64:1057-1068, 1991; Blackhart et al., J. Biol. Chem.
271:16466-16471, 1996).

Overproduction or lack of regulation of 
proteases can also have pathological consequences. 
Elastase, released within the lung in response to the 
presence of foreign particles, can damage lung tissue if 
its activity is not tightly regulated. Emphysema in 
smokers is believed to arise from an imbalance between 
elastase and its inhibitor, alpha-1-antitrypsin. This
balance may be restored by administration of exogenous
alpha-1-antitrypsin.

One family of proteases of particular interest 
is the serine proteases, which are characterized by a 
catalytic triad of serine, histidine, and aspartic acid 
residues. Serine proteases are used for a variety of 
industrial purposes. For example, the serine protease 
subtilisin is used in laundry detergents to aid in the 
removal of proteinaceous stains (e.g., Crabb, ACS
Symposium Series 460:82-94, 1991). In the food processing
industry, serine proteases are used to produce protein-
rich concentrates from fish and livestock, and in the
preparation of dairy products (Kida et al., Journal of
Fermentation and Bioengineering 80:478-484, 1995; Haard
and Simpson, in Martin, A.M., ed., Fisheries Processing:
Biotechnological Applications, Chapman and Hall, London,
1994, 132-154; Bos et al., European Patent Office
Publication 494 149 A1).

In general, enzymes, including proteases, are
active over a narrow range of environmental conditions
(temperature, pH, etc.), and many are highly specific for
particular substrates. The narrow range of activity for a
given enzyme limits its applicability and creates a need
for a selection of enzymes that (a) have similar
activities but are active under different conditions or
(b) have different substrates. For instance, an enzyme
capable of catalyzing a reaction at 50 °C may be so
inefficient at 35 °C that its use at the lower temperature
will not be feasible. For this reason, laundry detergents
generally contain a selection of proteolytic enzymes,
allowing the detergent to be used over a broad range of
wash temperature and pH.

In view of the specificity of proteolytic
enzymes and the growing use of proteases in industry,
research, and medicine, there is an ongoing need in the
art for new enzymes and new enzyme inhibitors. The
present invention addresses these needs and provides
other, related advantages.

SUMMARY OF THE INVENTION 

Within one aspect, the present invention
provides an isolated protein comprising a sequence of
amino acid residues that is at least 95% identical to SEQ
ID NO:2 from Ile, residue 111, through Asn, residue 373,
wherein the protein is a protease or protease precursor.
In one embodiment, the protein has from 254 to 398 amino
acid residues. In other embodiments, the protein
comprises residues 111 through 373 of SEQ ID NO:2 or SEQ
ID NO:15, residues 111 through 364 of SEQ ID NO:18,
residues 1 through 373 of SEQ ID NO:2 or SEQ ID NO:15, or
residues 1 through 364 of SEQ ID NO:18. The protein can
further comprise a heterologous affinity tag or binding
domain.

Within a second aspect, the invention provides
an isolated polynucleotide up to 1800 nucleotides in
length encoding a protein as disclosed above. Within one
embodiment, the polynucleotide is DNA. Within another
embodiment, the polynucleotide is double-stranded DNA.
Within a further embodiment, the protein encoded by the
polynucleotide comprises residues -19 through 373 of SEQ
ID NO:2.

Within a third aspect, the invention provides an
expression vector comprising the following operably linked
elements: (a) a transcription promoter; (b) a DNA segment
encoding a protein as disclosed above; and (c) a
transcription terminator. The expression vector can
further comprise a secretory signal sequence operably
linked to the DNA segment.

The invention also provides a cultured cell
containing an expression vector as disclosed above,
wherein the cell expresses the DNA segment. Within one
embodiment of the invention the expression vector further
comprises a secretory signal sequence operably linked to
the DNA segment, and the cell secretes the protein.

There is also provided a method of making a
protease or protease precursor. The method comprises the
steps of (a) providing a host cell containing an
expression vector as disclosed above; (b) culturing the
host cell under conditions whereby the DNA segment is
expressed; and (c) recovering the protein encoded by the
DNA segment. Within one embodiment the expression vector
further comprises a secretory signal sequence operably
linked to the DNA segment, the cell secretes the protein
into a culture medium, and the protein is recovered from
the medium.

Within a further aspect of the invention there
is provided a method of cleaving a peptide bond of a
substrate protein. The method comprises incubating the
substrate protein in the presence of a second protein
comprising a sequence of amino acid residues that is at
least 95% identical to SEQ ID NO:2 from Ile, residue 111,
through Asn, residue 373, whereby the peptide bond is
cleaved. Within one embodiment, the second protein is a
protease precursor and the method further comprises the
step of activating the second protein before the peptide
bond is cleaved.

The invention further provides a method of
detecting an inhibitor of proteolysis within a test sample
comprising the steps of (a) measuring proteolytic activity
of a protein as disclosed above in the presence of a test
sample to obtain a first value; (b) measuring proteolytic
activity of the protein in the absence of the test sample
to obtain a second value; and (c) comparing the first and
second values, whereby a higher second value relative to
the first value is indicative of an inhibitor of
proteolysis within the test sample.
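
The comparison in step (c) can be sketched in a few lines of Python. This is only an illustration: the function names and the percent-inhibition calculation are assumptions added here, not part of the disclosed method, and the activity values can be in any consistent unit.

```python
def indicates_inhibitor(first_value: float, second_value: float) -> bool:
    """Step (c): compare activity with the test sample (first value)
    against activity without it (second value).

    A higher second value relative to the first is taken as indicative
    of an inhibitor of proteolysis in the test sample.
    """
    return second_value > first_value


def percent_inhibition(first_value: float, second_value: float) -> float:
    """Hypothetical helper: express the drop in activity as a percentage
    of the uninhibited (second) value."""
    if second_value <= 0:
        raise ValueError("second value must be positive")
    return 100.0 * (second_value - first_value) / second_value


# Illustrative numbers only (not measured data).
print(indicates_inhibitor(0.2, 0.8))
print(percent_inhibition(0.2, 0.8))
```

In practice a threshold that allows for assay noise would be applied before calling a sample inhibitory; no such threshold is given in the text.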

The invention also provides an antibody that
specifically binds to a protein comprising a sequence of
amino acid residues that is at least 95% identical to SEQ
ID NO:2 from Ile, residue 111, through Asn, residue 373,
wherein the protein is a protease or protease precursor.

Within an additional aspect, the invention 
provides a DNA construct encoding a polypeptide fusion. 
The polypeptide fusion comprises, from amino terminus to 
carboxyl terminus, amino acid residues -19 through -1 of
SEQ ID NO: 2 operably linked to an additional polypeptide. 

These and other aspects of the invention will 
become evident upon reference to the following detailed 
description of the invention. 

DETAILED DESCRIPTION OF THE INVENTION

Prior to setting forth the invention in detail, 
certain terms used herein will be defined. 

The term "allelic variant" denotes any of two or
more alternative forms of a gene occupying the same
chromosomal locus. Allelic variation arises naturally
through mutation, and may result in phenotypic
polymorphism within populations. Gene mutations can be
silent (no change in the encoded polypeptide) or may
encode polypeptides having altered amino acid sequence.
The term "allelic variant" is also used herein to denote a
protein encoded by an allelic variant of a gene.

The term "complements of polynucleotide
molecules" denotes polynucleotide molecules having a
complementary base sequence and reverse orientation as
compared to a reference sequence. For example, the
sequence 5' ATGCACGGG 3' is complementary to 5' CCCGTGCAT
3'.
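
The relationship between the two example sequences can be checked with a short Python sketch. The function name is an illustrative choice, and the sketch assumes an unambiguous DNA alphabet (A, C, G, T only).

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complement of `seq` in reverse orientation,
    so that both strands are written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGCACGGG"))  # CCCGTGCAT
```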

The term "degenerate nucleotide sequence"
denotes a sequence of nucleotides that includes one or
more degenerate codons (as compared to a reference
polynucleotide molecule that encodes a polypeptide).
Degenerate codons contain different triplets of
nucleotides, but encode the same amino acid residue (i.e.,
GAU and GAC triplets each encode Asp).
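
As a minimal illustration of this definition, the sketch below tests whether two codons are degenerate with respect to one another. The codon table is deliberately partial (a few entries from the standard genetic code, chosen for illustration), and the function name is an assumption.

```python
# Partial excerpt of the standard genetic code (RNA codons); a complete
# table would be needed for real use.
CODON_TABLE = {
    "GAU": "Asp", "GAC": "Asp",
    "GAA": "Glu", "GAG": "Glu",
    "AAA": "Lys", "AAG": "Lys",
}


def are_degenerate(codon_a: str, codon_b: str) -> bool:
    """True when the two triplets differ but encode the same residue.
    DNA codons (with T) are accepted and converted to RNA (with U)."""
    a = codon_a.upper().replace("T", "U")
    b = codon_b.upper().replace("T", "U")
    return a != b and CODON_TABLE[a] == CODON_TABLE[b]


print(are_degenerate("GAU", "GAC"))  # True: both encode Asp
print(are_degenerate("GAU", "GAA"))  # False: Asp versus Glu
```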

A "DNA construct" is a single- or double-
stranded, linear or circular DNA molecule that comprises
segments of DNA combined and juxtaposed in a manner not
found in nature. DNA constructs exist as a result of
human manipulation, and include clones and other copies of
manipulated molecules.

A "DNA segment" is a portion of a larger DNA
molecule having specified attributes. For example, a DNA
segment encoding a specified polypeptide is a portion of a
longer DNA molecule, such as a plasmid or plasmid
fragment, that, when read from the 5' to the 3' direction,
encodes the sequence of amino acids of the specified
polypeptide.

The term "expression vector" denotes a DNA
construct that comprises a segment encoding a polypeptide
of interest operably linked to additional segments that
provide for its transcription in a host cell. Such
additional segments may include promoter and terminator
sequences, and may optionally include one or more origins
of replication, one or more selectable markers, an
enhancer, a polyadenylation signal, and the like.
Expression vectors are generally derived from plasmid or
viral DNA, or may contain elements of both.

The term "isolated", when applied to a
polynucleotide molecule, denotes that the polynucleotide
has been removed from its natural genetic milieu and is
thus free of other extraneous or unwanted coding
sequences, and is in a form suitable for use within
genetically engineered protein production systems. Such
isolated molecules are those that are separated from their
natural environment and include cDNA and genomic clones,
as well as synthetic polynucleotides. Isolated DNA
molecules of the present invention may include naturally
occurring 5' and 3' untranslated regions such as promoters
and terminators. The identification of associated regions
will be evident to one of ordinary skill in the art (see
for example, Dynan and Tjian, Nature 316:774-78, 1985).
When applied to a protein, the term "isolated" indicates
that the protein is found in a condition other than its
native environment, such as apart from blood and animal
tissue. In a preferred form, the isolated protein is
substantially free of other proteins, particularly other
proteins of animal origin. It is preferred to provide the
protein in a highly purified form, i.e., at least 90%
pure, preferably greater than 95% pure, more preferably
greater than 99% pure.

The term "operably linked", when referring to
DNA segments, denotes that the segments are arranged so
that they function in concert for their intended purposes,
e.g., transcription initiates in the promoter and proceeds
through the coding segment to the terminator.

The term "ortholog" denotes a polypeptide or 
protein obtained from one species that is the functional 
counterpart of a polypeptide or protein from a different 
species. Sequence differences among orthologs are the 
result of speciation.

The term "polynucleotide" denotes a single- or
double-stranded polymer of deoxyribonucleotide or
ribonucleotide bases read from the 5' to the 3' end.
Polynucleotides include RNA and DNA, and may be isolated
from natural sources, synthesized in vitro, or prepared
from a combination of natural and synthetic molecules.
The length of a polynucleotide molecule is given herein in
terms of nucleotides (abbreviated "nt") or base pairs
(abbreviated "bp"). The term "nucleotides" is used for
both single- and double-stranded molecules where the
context permits. When the term is applied to double-
stranded molecules it is used to denote overall length and
will be understood to be equivalent to the term "base
pairs". It will be recognized by those skilled in the art
that the two strands of a double-stranded polynucleotide
may differ slightly in length and that the ends thereof
may be staggered as a result of enzymatic cleavage; thus
all nucleotides within a double-stranded polynucleotide
molecule may not be paired. Such unpaired ends will in
general not exceed 20 nt in length.

The term "promoter" denotes a portion of a gene 
containing DNA sequences that provide for the binding of 
RNA polymerase and initiation of transcription. Promoter 
sequences are commonly, but not always, found in the 5' 
non-coding regions of genes.




A "protease" is an enzyme that cleaves peptide 
bonds in proteins. A "protease precursor" is a relatively 
inactive form of the enzyme that commonly becomes 
activated upon cleavage by another protease. 

The term "secretory signal sequence" denotes a 
DNA sequence that encodes a polypeptide (a "secretory 
peptide") that, as a component of a larger polypeptide, 
directs the larger polypeptide through a secretory pathway 
of a cell in which it is synthesized. The larger 
polypeptide is commonly cleaved to remove the secretory 
peptide during transit through the secretory pathway. 

All references cited herein are incorporated by 
reference in their entirety. 

The present invention provides novel serine
proteases, serine protease precursors, and useful
polypeptide fragments thereof. The sequence of a
representative protein of the present invention is shown
in SEQ ID NO:2. This protein shows significant amino acid
sequence homology to several serine proteases, including
Bacillus licheniformis glutamyl endopeptidase (Svendsen
and Breddam, Eur. J. Biochem. 204:165-171, 1992), human
clotting factor X (Leytus et al., Biochem. 25:5098-5102,
1986), human elastase (Kawashima et al., DNA 6:163-172,
1987), rat mast cell protease (Benfey et al., J. Biol.
Chem. 262:5377-5384, 1987), Streptomyces griseus trypsin
(Kim et al., Biochem. Biophys. Res. Comm. 181:707-713,
1991), Hypoderma lineatum collagenase (J. Biol. Chem.
262:7546-7551, 1987), and bovine trypsinogen (Titani et
al., Biochem. 14:1358-1366, 1975). The protein has been
designated "Zsig13".

A Zsig13 polynucleotide sequence was initially
identified by querying a database of expressed sequence
tags (ESTs) for secretory signal sequences characterized
by an upstream methionine start site, a hydrophobic region
of approximately 13 amino acid residues, and a cleavage
site as defined by von Heijne (Nuc. Acids Res. 14:4683,
1986). Analysis of a full-length DNA (shown in SEQ ID
NO:1) revealed its homology with other members of the
serine protease family. Northern blot analysis indicated
the presence of two corresponding messages, a predominant
transcript of approximately 1.8 kb and a secondary
transcript of approximately 4 kb. The sequence of SEQ ID
NO:1 consists of 1634 bp, not including a poly(A) tail.
The sequence includes an open reading frame of 1176 base
pairs.

An alignment of Zsig13 with related proteins was
used to identify the catalytic triad of His (156), Asp
(227) and Ser (322) as shown in SEQ ID NO:2. The Leu-Thr-
Ala-Ala-His-Cys sequence (residues 152-157 of SEQ ID NO:2)
is a characteristic active site His signature within
serine proteases. Residues -19 through -1 of SEQ ID NO:2
make up a putative signal peptide. Residues 106-109 of
SEQ ID NO:2 (Arg-Arg-Lys-Arg) are a characteristic
cleavage site; such cleavage may serve a regulatory
function, such as activation of the protein during or
after secretion. Activation by proteolytic cleavage is
common among serine proteases. While not wishing to be
bound by theory, the protein is believed to become active
following exposure of a free amino group on Gln 110 or,
with additional processing, Ile 111. However, in contrast
to many other serine proteases, the non-catalytic, amino-
terminal fragment does not appear to remain tethered to
the remainder of the molecule after this cleavage has
occurred. Alignment of sequences further indicates that
active site contact residues are at positions 244 (Ile),
291 (Asp), 292 (Ala), 316 (Lys), 317 (Ile), 328 (Asp), 350
(Ile), 356 (Gly), 358 (Tyr) and 360 (Asp) of SEQ ID NO:2.
Sequence alignment identified the Lys residue at position
316 as the key residue in the base of the P1 ligand
specificity pocket, generating specificity for Glu and/or
Asp in the P1 position of the substrate protein.







With reference to SEQ ID NO:2, additional
structural features of Zsig13 include paired cysteine
residues at positions 46 and 50, 141 and 157, 276 and 290,
and 351 and 361. Potential N-linked glycosylation sites
are at residues Asn-74 and Asn-188. The calculated
molecular weight of the peptide backbone of the 392-
residue precursor is 43,829.55, with a predicted pI of
10.44. The calculated peptide backbone molecular weight
of residues 110-373 is 30,074, with a predicted pI of
10.4.
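
The residue counts quoted above can be cross-checked against the 1176 bp open reading frame. The sketch below assumes that the 1176 bp figure counts sense codons only (no stop codon) and that the numbering of SEQ ID NO:2 runs from -19 to -1 and then from 1 to 373, with no residue 0; the function name is illustrative.

```python
def residue_count(first: int, last: int) -> int:
    """Count residues from `first` through `last`, inclusive.

    Assumes a numbering scheme with no residue 0, as used when a
    signal peptide is numbered -19..-1 ahead of residue 1.
    """
    if first == 0 or last == 0 or first > last:
        raise ValueError("invalid residue range")
    span = last - first + 1
    # A range that crosses from negative to positive numbers would
    # otherwise count the non-existent position 0.
    return span - 1 if first < 0 < last else span


ORF_BP = 1176  # open reading frame length stated in the text

print(ORF_BP // 3)               # codons in the reading frame
print(residue_count(-19, 373))   # signal peptide plus residues 1-373
print(residue_count(110, 373))   # fragment beginning at Gln 110
```

Under these assumptions the reading frame corresponds to 392 codons, which agrees with the 392-residue precursor.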

The Zsig13 protein was found to be highly
expressed in tissues that are exposed to the external
environment, including trachea, bladder, small intestine,
colon, and prostate. This tissue distribution suggests a
digestive or anti-bacterial function. Several anti-
bacterial serine proteases are known to be produced in
neutrophils, where they are stored in granules as inactive
proforms (Gabay, ibid.; Scocchi et al., ibid.).
Expression was also detected in aorta and fetal kidney.

The present invention also provides isolated
Zsig13 polypeptides that are substantially homologous to
the polypeptides of SEQ ID NO:2 and their orthologs. The
term "substantially homologous" is used herein to denote
polypeptides having 50%, preferably 60%, more preferably
at least 80%, sequence identity to polypeptides of SEQ ID
NO:2 or their orthologs. Such polypeptides will more
preferably be at least 90% identical, and most preferably
95% or more identical to polypeptides of SEQ ID NO:2 or
their orthologs. Percent sequence identity is determined
by conventional methods. See, for example, Altschul et
al., Bull. Math. Bio. 48:603-616, 1986 and Henikoff and
Henikoff, Proc. Natl. Acad. Sci. USA 89:10915-10919, 1992.
Briefly, two amino acid sequences are aligned to optimize
the alignment scores using a gap opening penalty of 10, a
gap extension penalty of 1, and the "blosum 62" scoring
matrix of Henikoff and Henikoff (ibid.) as shown in Table
1 (amino acids are indicated by the standard one-letter
codes). The percent identity is then calculated as:

          Total number of identical matches
  ---------------------------------------------------  x 100
  [length of the longer sequence plus the number of
   gaps introduced into the longer sequence in order
   to align the two sequences]

Sequence identity of polynucleotide molecules
is determined by similar methods using a ratio as
disclosed above.

Substantially homologous proteins and
polypeptides are characterized as having one or more
amino acid substitutions, deletions or additions. These
changes are preferably of a minor nature, that is
conservative amino acid substitutions (see Table 2) and
other substitutions that do not significantly affect the
folding or activity of the protein or polypeptide; small
deletions, typically of one to about 30 amino acids; and
small amino- or carboxyl-terminal extensions, such as an
amino-terminal methionine residue, a small linker peptide
of up to about 20-25 residues, or a small extension that
facilitates purification (an affinity tag) such as a
poly-histidine tract, protein A (Nilsson et al., EMBO J.
4:1075, 1985; Nilsson et al., Methods Enzymol. 198:3,
1991), glutathione S transferase (Smith and Johnson, Gene
67:31, 1988), maltose binding protein (Kellerman and
Ferenci, Methods Enzymol. 90:459-463, 1982; Guan et al.,
Gene 67:21-30, 1987), thioredoxin, ubiquitin, cellulose
binding protein, T7 polymerase, or other antigenic
epitope or binding domain. See, in general, Ford et al.,
Protein Expression and Purification 2:95-107, 1991.
DNAs encoding affinity tags are available from commercial
suppliers (e.g., Pharmacia Biotech, Piscataway, NJ; New
England Biolabs, Beverly, MA). Zsig13 proteins
comprising linkers, affinity tags, or other extensions
will typically be from 274 to 398 residues in length,
given a polypeptide having an amino terminus within
residues 1-111 of SEQ ID NO:2 or SEQ ID NO:15 and a
carboxyl terminus within residues 364-373 of SEQ ID NO:2
or SEQ ID NO:15, and further comprising an extension of
20-25 residues. Those skilled in the art will recognize
that polypeptides comprising longer extensions are also
within the scope of the present invention.
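
The 274 to 398 residue range follows from the terminus positions and extension lengths just given, as the following sketch shows. The constant and function names are illustrative.

```python
N_TERMINUS = (1, 111)    # amino terminus falls within residues 1-111
C_TERMINUS = (364, 373)  # carboxyl terminus falls within residues 364-373
EXTENSION = (20, 25)     # linker or affinity tag of 20-25 residues


def length_range(n_term, c_term, extension=(0, 0)):
    """Return (shortest, longest) polypeptide length in residues."""
    shortest_core = c_term[0] - n_term[1] + 1  # residues 111 through 364
    longest_core = c_term[1] - n_term[0] + 1   # residues 1 through 373
    return shortest_core + extension[0], longest_core + extension[1]


print(length_range(N_TERMINUS, C_TERMINUS))             # (254, 373)
print(length_range(N_TERMINUS, C_TERMINUS, EXTENSION))  # (274, 398)
```

Without an extension the shortest polypeptide is 254 residues, which matches the lower bound of the 254 to 398 residue embodiment in the Summary.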

Table 2

Conservative amino acid substitutions

Basic:          arginine
                lysine
                histidine
Acidic:         glutamic acid
                aspartic acid
Polar:          glutamine
                asparagine
Hydrophobic:    leucine
                isoleucine
                valine
Aromatic:       phenylalanine
                tryptophan
                tyrosine
Small:          glycine
                alanine
                serine
                threonine
                methionine


The proteins of the present invention can also
comprise non-naturally occurring amino acid residues.
Non-naturally occurring amino acids include, without
limitation, trans-3-methylproline, 2,4-methanoproline,
cis-4-hydroxyproline, trans-4-hydroxyproline,
N-methylglycine, allo-threonine, methylthreonine,
hydroxyethylcysteine, hydroxyethylhomocysteine,
nitroglutamine, homoglutamine, pipecolic acid,
tert-leucine, norvaline, 2-azaphenylalanine,
3-azaphenylalanine, 4-azaphenylalanine, and
4-fluorophenylalanine. Several methods are known in the
art for incorporating non-naturally occurring amino acid
residues into proteins. For example, an in vitro system
can be employed wherein nonsense mutations are suppressed
using chemically aminoacylated suppressor tRNAs. Methods
for synthesizing amino acids and aminoacylating tRNA are
known in the art. Transcription and translation of
plasmids containing nonsense mutations is carried out in
a cell-free system comprising an E. coli S30 extract and
commercially available enzymes and other reagents.
Proteins are purified by chromatography. See, for
example, Robertson et al., J. Am. Chem. Soc. 113:2722,
1991; Ellman et al., Methods Enzymol. 202:301, 1991;
Chung et al., Science 259:806-809, 1993; and Chung et
al., Proc. Natl. Acad. Sci. USA 90:10145-10149, 1993.
In a second method, translation is carried out in Xenopus
oocytes by microinjection of mutated mRNA and chemically
aminoacylated suppressor tRNAs (Turcatti et al., J. Biol.
Chem. 271:19991-19998, 1996). Within a third method, E.
coli cells are cultured in the absence of a natural amino
acid that is to be replaced (e.g., phenylalanine) and in
the presence of the desired non-naturally occurring amino
acid(s) (e.g., 2-azaphenylalanine, 3-azaphenylalanine,
4-azaphenylalanine, or 4-fluorophenylalanine). The
non-naturally occurring amino acid is incorporated into the
protein in place of its natural counterpart. See, Koide
et al., Biochem. 33:7470-7476, 1994. Naturally occurring
amino acid residues can be converted to non-naturally
occurring species by in vitro chemical modification.
Chemical modification can be combined with site-directed
mutagenesis to further expand the range of substitutions
(Wynn and Richards, Protein Sci. 2:395-403, 1993).

Essential amino acids in the Zsig13
polypeptides of the present invention can be identified
according to procedures known in the art, such as site-
directed mutagenesis or alanine-scanning mutagenesis
(Cunningham and Wells, Science 244:1081-1085, 1989). In
the latter technique, single alanine mutations are
introduced at every residue in the molecule, and the
resultant mutant molecules are tested for biological
activity as disclosed above to identify amino acid
residues that are critical to the activity of the
molecule. See also, Hilton et al., J. Biol. Chem.
271:4699-4708, 1996. Residues important for substrate
binding and cleavage can also be determined by physical
analysis of structure, as determined by such techniques
as nuclear magnetic resonance, crystallography, electron
diffraction or photoaffinity labeling, in conjunction
with mutation of putative contact site amino acids. See,
for example, de Vos et al., Science 255:306-312, 1992;
Smith et al., J. Mol. Biol. 224:899-904, 1992; Wlodaver
et al., FEBS Lett. 309:59-64, 1992. The identities of
essential amino acids can also be inferred from analysis
of homologies with related serine proteases.
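
The design of an alanine scan can be sketched as a simple enumeration of single-residue mutants. The example below uses the six-residue His signature (Leu-Thr-Ala-Ala-His-Cys, residues 152-157 of SEQ ID NO:2) as a toy input rather than a full-length protein; the function name is illustrative, and skipping positions that are already alanine is an assumption made here because such a substitution would leave the sequence unchanged.

```python
def alanine_scan(sequence: str, first_position: int = 1):
    """Yield (position, original_residue, mutant_sequence) for each
    single alanine substitution of a one-letter-code sequence."""
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue  # already alanine; no distinct mutant exists
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        yield first_position + index, residue, mutant


# Toy input: residues 152-157 of SEQ ID NO:2 as given in the text.
for position, residue, mutant in alanine_scan("LTAAHC", first_position=152):
    print(f"{residue}{position}A -> {mutant}")
```

Each mutant would then be expressed and assayed for proteolytic activity; the enumeration itself says nothing about which residues are essential.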

Multiple amino acid substitutions can be made
and tested using known methods of mutagenesis and
screening, such as those disclosed by Reidhaar-Olson and
Sauer (Science 241:53-57, 1988) or Bowie and Sauer (Proc.
Natl. Acad. Sci. USA 86:2152-2156, 1989). Briefly, these
authors disclose methods for simultaneously randomizing
two or more positions in a polypeptide, selecting for
functional polypeptide, and then sequencing the
mutagenized polypeptides to determine the spectrum of
allowable substitutions at each position. Other methods
that can be used include phage display (e.g., Lowman et
al., Biochem. 30:10832-10837, 1991; Ladner et al., U.S.
Patent No. 5,223,409; Huse, WIPO Publication WO 92/06204)
and region-directed mutagenesis (Derbyshire et al., Gene
46:145, 1986; Ner et al., DNA 7:127, 1988).

Mutagenesis methods as disclosed above can be
combined with high-throughput, automated screening
methods to detect activity of cloned, mutagenized
polypeptides in host cells. Mutagenized DNA molecules
that encode proteolytically active proteins or precursors
thereof can be recovered from the host cells and rapidly
sequenced using modern equipment. These methods allow
the rapid determination of the importance of individual
amino acid residues in a polypeptide of interest, and can
be applied to polypeptides of unknown structure.

Using the methods disclosed above, one of
ordinary skill in the art can identify and/or prepare a
variety of polypeptides that are substantially homologous
to residues 111 through 373 of SEQ ID NO:2 or allelic
variants thereof and retain the proteolytic properties of
the wild-type protein. Such polypeptides may include a
targeting moiety comprising additional amino acid
residues that form an independently folding binding
domain. Such domains include, for example, an
extracellular ligand-binding domain (e.g., one or more
fibronectin type III domains) of a cytokine receptor;
immunoglobulin domains; DNA binding domains (see, e.g.,
He et al., Nature 378:92-96, 1995); affinity tags; and
the like. Such polypeptides may also include additional
polypeptide segments as generally disclosed above.

In addition to the fusion proteins disclosed 
above, the present invention provides fusions comprising 
the secretory peptide of Zsig13 (residues -19 through -1
of SEQ ID NO:2). This secretory peptide can be used to
direct the secretion of other proteins of interest by 
joining a polynucleotide sequence encoding it to the 5' 
end of a sequence encoding a protein of interest. 

Within the present invention, proteins,
including variants and fragments of SEQ ID NO:2, can be
tested for serine protease activity using conventional
assays. Briefly, substrate cleavage is conveniently
assayed using a tetrapeptide that mimics the cleavage
site of the natural substrate and which is linked, via a
peptide bond, to a carboxyl-terminal para-nitroanilide
(pNA) group. The protease hydrolyzes the bond between
the fourth amino acid residue and the pNA group, causing
the pNA group to undergo a dramatic increase in
absorbance at 405 nm. Such substrates will preferably
contain a Glu or Asp residue at the P1 position.
Suitable substrates can be synthesized according to known
methods or obtained from commercial suppliers. When the
serine protease is prepared as an inactive precursor
(e.g., comprising N-terminal residues 1-109 of SEQ ID
NO:2), it is activated by cleavage with a suitable
protease (e.g., furin (Steiner et al., J. Biol. Chem.
267:23435-23438, 1992)) prior to assay. Assays of this
type are well known in the art. See, for example,
Lottenberg et al., Thrombosis Research 28:313-332, 1982;
Cho et al., Biochem. 23:644-650, 1984; Foster et al.,
Biochem. 26:7003-7011, 1987.
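
Data from an assay of this kind are commonly reduced to an initial rate, that is, the slope of absorbance at 405 nm against time. The sketch below fits that slope by ordinary least squares. It is an illustration only: the function name is an assumption, the readings are invented for the example rather than measured, and conversion of the slope to a molar rate (which requires an extinction coefficient and path length) is left out because the text gives neither.

```python
def initial_rate(times_s, absorbances):
    """Least-squares slope of A405 versus time (absorbance units per s)."""
    n = len(times_s)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be equal")
    sxy = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_s, absorbances))
    return sxy / sxx


# Hypothetical readings taken once per minute.
times = [0, 60, 120, 180]
a405 = [0.05, 0.11, 0.17, 0.23]
print(f"{initial_rate(times, a405):.4f} A405 units per second")
```

Only the early, linear portion of a progress curve should be passed to such a function, since the rate falls as substrate is consumed.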

The isolated polynucleotides of the present
invention include DNA and RNA. Methods for isolating DNA
and RNA are well known in the art. For example, RNA can
be isolated from trachea, bladder, small intestine,
colon, or prostate, which RNA is then used as a template
for preparation of complementary DNA (cDNA). DNA can
also be prepared using RNA from other tissues or isolated
as genomic DNA. Total RNA can be prepared using
guanidine HCl extraction followed by isolation by
centrifugation in a CsCl gradient (Chirgwin et al.,
Biochemistry 18:52-94, 1979). Poly(A)+ RNA is prepared
from total RNA using the method of Aviv and Leder (Proc.
Natl. Acad. Sci. USA 69:1408-1412, 1972). Complementary
DNA (cDNA) is prepared from poly(A)+ RNA using known
methods. Polynucleotides encoding Zsig13 polypeptides
are then identified and isolated by, for example,
hybridization or polymerase chain reaction (PCR).
Within SEQ ID NO:1 and SEQ ID NO:2, residues
80, 95, 96, and 149 can be any amino acid residue
(denoted as Xaa). Within a preferred embodiment of the
invention, residue 80 is Thr, residue 95 is Gln, residue
96 is His, and residue 149 is Lys.

A second Zsigl3 DNA sequence is shown in SEQ ID
NO:14 (with the corresponding amino acid sequence shown
in SEQ ID NO:15). Within SEQ ID NO:15, residue 60 is
Glu, residue 80 is Thr, residue 95 is Gln, residue 96 is
His, residue 149 is Lys, residue 299 is Ser, and residue
369 is Pro. All other residues in SEQ ID NO:15 are the
same as their respective counterparts in SEQ ID NO:2.
The calculated molecular weight of the peptide backbone
of the 392-residue polypeptide shown in SEQ ID NO:15 is
43,918.56, with a predicted pI of 10.38. The calculated
peptide backbone molecular weight of residues 110-373 is
28,113.80, with a predicted pI of 10.49.

A third Zsigl3 DNA sequence is shown in SEQ ID
NO:17, with the encoded amino acid sequence shown in SEQ
ID NO:18. SEQ ID NO:18 is identical to SEQ ID NO:15, but
terminates at residue 364 (Gly) due to a one base pair
insertion at position 1256 in SEQ ID NO:17 relative to
SEQ ID NO:14. There are two additional differences
between SEQ ID NO:14 and SEQ ID NO:17 in the 3'
untranslated region (nucleotides 1291 and 1374 of SEQ ID
NO:17). The calculated molecular weight of the 383-
residue peptide backbone of SEQ ID NO:18 is 43,003.55,
with a predicted pI of 10.44. The calculated peptide
molecular weight of residues 110-364 is 29,124.01, with a
predicted pI of 10.53.

Those skilled in the art will recognize that 
the sequences disclosed herein are representative of the 
human Zsigl3 gene and polypeptide, and that allelic
variation and alternative splicing are expected to occur.
Allelic variants can be cloned by probing cDNA or genomic
libraries from different individuals according to
standard procedures. Allelic variants of the disclosed
DNA sequences, including those containing silent
mutations and those in which mutations result in amino 
acid sequence changes, are within the scope of the 
present invention, as are proteins which are allelic 
variants of the disclosed protein sequences. 

The invention also encompasses degenerate
polynucleotide sequences encoding proteins as disclosed
above. Those skilled in the art will readily recognize
that, in view of the degeneracy of the genetic code,
considerable sequence variation is possible among these
polynucleotide molecules. SEQ ID NO:16 is a degenerate
DNA sequence that encompasses all DNAs that encode the
Zsigl3 polypeptide of SEQ ID NO:15. Those skilled in the
art will recognize that the degenerate sequence of SEQ ID
NO:16 also provides all RNA sequences encoding SEQ ID
NO:15 by substituting U for T. Thus, Zsigl3 polypeptide-
encoding polynucleotides comprising segments of SEQ ID
NO:16 and their RNA equivalents are contemplated by the
present invention. Table 3 sets forth the one-letter
codes used within SEQ ID NO: 16 to denote degenerate 
nucleotide positions. "Resolutions" are the nucleotides 
denoted by a code letter. "Complement" indicates the 
code for the complementary nucleotide (s) . For example, 
the code Y denotes either C or T, and its complement R 
denotes A or G, A being complementary to T, and G being 
complementary to C. 



TABLE 3

Nucleotide   Resolutions   Complement   Resolutions

A            A             T            T
C            C             G            G
G            G             C            C
T            T             A            A
R            A|G           Y            C|T
Y            C|T           R            A|G
M            A|C           K            G|T
K            G|T           M            A|C
S            C|G           S            C|G
W            A|T           W            A|T
H            A|C|T         D            A|G|T
B            C|G|T         V            A|C|G
V            A|C|G         B            C|G|T
D            A|G|T         H            A|C|T
N            A|C|G|T       N            A|C|G|T
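
The code letters of Table 3 lend themselves to a simple lookup. The Python sketch below is offered only as an illustration of how the "Resolutions" and "Complement" columns relate to one another; the function names are hypothetical and form no part of the disclosure.

```python
from itertools import product

# One-letter codes and the nucleotides they resolve to (Table 3).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "H": "ACT", "B": "CGT",
    "V": "ACG", "D": "AGT", "N": "ACGT",
}
BASE_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement_code(code):
    """Return the code letter whose resolutions complement those of `code`."""
    wanted = "".join(sorted(BASE_COMPLEMENT[b] for b in IUPAC[code]))
    for letter, bases in IUPAC.items():
        if bases == wanted:
            return letter
    raise ValueError(f"no complement found for {code!r}")


def expand(seq):
    """List every unambiguous sequence denoted by a degenerate sequence."""
    return ["".join(p) for p in product(*(IUPAC[c] for c in seq))]


def to_rna(seq):
    """Give the RNA equivalent of a DNA sequence by substituting U for T."""
    return seq.replace("T", "U")


print(complement_code("Y"))  # Y (C|T) is complemented by R (A|G)
print(expand("TGY"))         # the two codons covered by TGY
print(to_rna("TGY"))
```

Deriving each complement from the resolutions, instead of hard-coding the third column, gives an independent check that the table is internally consistent.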



The degenerate codons used in SEQ ID NO: 16, 
encompassing all possible codons for a given amino acid, 
are set forth in Table 4, below. 



TABLE 4

Amino      One-Letter                               Degenerate
Acid       Code         Codons                      Codon

Cys        C            TGC TGT                     TGY
Ser        S            AGC AGT TCA TCC TCG TCT     WSN
Thr        T            ACA ACC ACG ACT             ACN
Pro        P            CCA CCC CCG CCT             CCN
Ala        A            GCA GCC GCG GCT             GCN
Gly        G            GGA GGC GGG GGT             GGN
Asn        N            AAC AAT                     AAY
Asp        D            GAC GAT                     GAY
Glu        E            GAA GAG                     GAR
Gln        Q            CAA CAG                     CAR
His        H            CAC CAT                     CAY
Arg        R            AGA AGG CGA CGC CGG CGT     MGN
Lys        K            AAA AAG                     AAR
Met        M            ATG                         ATG
Ile        I            ATA ATC ATT                 ATH
Leu        L            CTA CTC CTG CTT TTA TTG     YTN
Val        V            GTA GTC GTG GTT             GTN
Phe        F            TTC TTT                     TTY
Tyr        Y            TAC TAT                     TAY
Trp        W            TGG                         TGG
Ter                     TAA TAG TGA                 TRR
Asn|Asp    B                                        RAY
Glu|Gln    Z                                        SAR
Any        X                                        NNN
Gap        -



One of ordinary skill in the art will
appreciate that some ambiguity is introduced in
determining a degenerate codon, representative of all
possible codons encoding each amino acid. For example,
the degenerate codon for serine (WSN) can, in some
circumstances, encode arginine (AGR), and the degenerate
codon for arginine (MGN) can, in some circumstances,
encode serine (AGY). A similar relationship exists
between codons encoding phenylalanine and leucine. Thus,
some polynucleotides encompassed by the degenerate
sequence may encode variant amino acid sequences, but one
of ordinary skill in the art can easily identify such
variant sequences by reference to the amino acid sequence
of SEQ ID NO:15. Variant sequences can be readily tested
for functionality as described herein.

For any Zsigl3 polypeptide (e.g., SEQ ID
NO:18), including variants and fusion proteins, one of
ordinary skill in the art can readily generate a fully
degenerate polynucleotide sequence encoding that variant
using the information set forth in Tables 3 and 4, above.
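
The use of Tables 3 and 4 can be sketched in a few lines of Python. The example below is illustrative only: it back-translates a short, hypothetical peptide (not a Zsigl3 sequence) into a fully degenerate DNA sequence and then lists which residues a given degenerate codon can actually encode under the standard genetic code, which exposes the ambiguity discussed above. All function names are assumptions made for the example.

```python
from itertools import product

# Resolutions of the one-letter nucleotide codes (Table 3).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "H": "ACT", "B": "CGT",
    "V": "ACG", "D": "AGT", "N": "ACGT",
}

# Degenerate codon for each amino acid (Table 4).
DEGENERATE_CODON = {
    "C": "TGY", "S": "WSN", "T": "ACN", "P": "CCN", "A": "GCN",
    "G": "GGN", "N": "AAY", "D": "GAY", "E": "GAR", "Q": "CAR",
    "H": "CAY", "R": "MGN", "K": "AAR", "M": "ATG", "I": "ATH",
    "L": "YTN", "V": "GTN", "F": "TTY", "Y": "TAY", "W": "TGG",
}

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def back_translate(protein):
    """Return the fully degenerate DNA sequence encoding `protein`."""
    return "".join(DEGENERATE_CODON[aa] for aa in protein)


def encoded_residues(degenerate_codon):
    """Return every residue encoded by some resolution of the codon."""
    codons = ("".join(p) for p in product(*(IUPAC[c] for c in degenerate_codon)))
    return {CODON_TABLE[c] for c in codons}


print(back_translate("MKR"))              # hypothetical tripeptide
print(sorted(encoded_residues("MGN")))    # arginine codon also covers serine
print(sorted(encoded_residues("YTN")))    # leucine codon also covers phenylalanine
```

Enumerating the resolutions in this way is one means of identifying variant sequences: any resolved sequence whose translation differs from the reference amino acid sequence can be set aside or tested separately.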

Allelic variants and orthologs of the human 
Zsigl3 proteins disclosed herein can be obtained by 
conventional cloning methods. The DNA sequences shown in
SEQ ID NO:1, SEQ ID NO:14, SEQ ID NO:17, and portions
thereof can be used as probes or primers to prepare other
polynucleotides from cells or libraries (including cDNA
and genomic libraries) from humans or other animals of
interest, particularly mammals including rodents,
rabbits, ungulates, primates, and others of economic
importance or biomedical interest. It is preferred to
derive probes and primers from regions of the molecule
that are relatively conserved within the family of serine
proteases, such as residues 141-146, 153-158, 209-214,
and 224-229 of SEQ ID NO:2. Methods for isolating
additional polynucleotides are known in the art. For 
example, a cDNA can be cloned using mRNA obtained from a 
tissue or cell type that expresses the protein. Suitable 
sources of mRNA can be identified by probing Northern 
blots with probes designed from the sequences disclosed 
herein. Preferred sources of mRNA include trachea, small 
intestine, colon, prostate, and bladder. A library is 
then prepared from mRNA of a positive tissue or cell 
line. A cDNA of interest can then be isolated by a 
variety of methods, such as by probing with a complete or 
partial human cDNA or with one or more sets of degenerate 
probes based on the disclosed sequences. A cDNA can also 
be cloned using the polymerase chain reaction, or PCR 
(Mullis, U.S. Patent 4,683,202), using primers designed 
from the sequences disclosed herein. Of particular 
interest for cloning are degenerate probes and primers 
designed from the regions of SEQ ID NO:2 disclosed above
and alignment with other serine proteases. Families of 
preferred degenerate probes are shown in Table 5. 



TABLE 5

Nucleotides
(SEQ ID NO:1)   Sense                      Complement

582-598         TGY ACN GGN WSN HTN RT     AY NAD NSW NCC NGT RCA
                (SEQ ID NO:3)              (SEQ ID NO:4)

618-634         ACN GCN GSN CAY TGY AT     AT RCA RTG NSC NGC NGT
                (SEQ ID NO:5)              (SEQ ID NO:6)

787-803         WY RTN CCN VWN GGN TGG     CCA NCC NBW NGG NAY RW
                (SEQ ID NO:7)              (SEQ ID NO:8)

831-847         AYN RAY TAY GAY TAY GS     SC RTA RTC RTA RTY NRT
                (SEQ ID NO:9)              (SEQ ID NO:10)
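
For illustration, the relationship between a sense probe family and its complement, and the number of distinct oligonucleotides that a degenerate probe represents, can be worked out as in the following Python sketch. It uses the first probe pair listed in Table 5; the function names are hypothetical.

```python
# Complement of each one-letter nucleotide code.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "M": "K", "K": "M",
    "S": "S", "W": "W", "H": "D", "D": "H",
    "B": "V", "V": "B", "N": "N",
}

# Number of nucleotides each code letter resolves to.
RESOLUTIONS = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "M": 2, "K": 2, "S": 2, "W": 2,
    "H": 3, "B": 3, "V": 3, "D": 3,
    "N": 4,
}


def reverse_complement(seq):
    """Reverse-complement a degenerate sequence, ignoring spaces."""
    letters = seq.replace(" ", "")
    return "".join(COMPLEMENT[c] for c in reversed(letters))


def degeneracy(seq):
    """Count the unambiguous sequences represented by a degenerate probe."""
    count = 1
    for c in seq.replace(" ", ""):
        count *= RESOLUTIONS[c]
    return count


sense = "TGY ACN GGN WSN HTN RT"  # sense probe for nucleotides 582-598
print(reverse_complement(sense))
print(degeneracy(sense))
```

The degeneracy count is a practical consideration when choosing among probe families, since a highly degenerate pool dilutes the concentration of any single matching oligonucleotide.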



Within an additional method, the cDNA library
can be used to transform or transfect host cells, and
expression of the cDNA of interest can be detected with
an antibody that specifically binds to an epitope of a
Zsigl3 polypeptide. Similar techniques can also be
applied to the isolation of genomic clones.

Within preferred embodiments of the invention
the isolated polynucleotides will hybridize to similar
sized regions of SEQ ID NO:1, SEQ ID NO:14, SEQ ID NO:17,
or a sequence complementary to SEQ ID NO:1, SEQ ID NO:14,
or SEQ ID NO:17, under stringent conditions. In general,
stringent conditions are selected to be about 5°C lower
than the thermal melting point (Tm) for the specific
sequence at a defined ionic strength and pH. The Tm is
the temperature (under defined ionic strength and pH) at
which 50% of the target sequence hybridizes to a
perfectly matched probe. Typical stringent conditions
are those in which the salt concentration does not exceed
about 0.03 M at pH 7 and the temperature is at least
about 60°C, with washes carried out in the presence of
EDTA.
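
As a rough illustration of the "about 5°C lower than the Tm" rule, the Python sketch below estimates a melting temperature and derives a hybridization temperature from it. The text above does not specify how the Tm is to be calculated; the formula used here is one widely quoted empirical approximation for long DNA duplexes and is an assumption of this example, as are the function names. Published versions of the formula differ in the length-correction constant, which is therefore exposed as a parameter.

```python
import math


def estimate_tm_celsius(na_molar, gc_percent, length_nt, length_correction=600.0):
    """Estimate Tm (deg C) of a long DNA duplex from salt, GC content and length.

    Assumed empirical form:
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - length_correction/length
    """
    if na_molar <= 0 or length_nt <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - length_correction / length_nt)


def stringent_temperature(tm_celsius, offset=5.0):
    """Temperature about `offset` degrees below the Tm."""
    return tm_celsius - offset


# Hypothetical probe: 600 nt, 50% GC, in 1.0 M monovalent cation.
tm = estimate_tm_celsius(1.0, 50.0, 600)
print(f"Tm ~ {tm:.1f} C, stringent temperature ~ {stringent_temperature(tm):.1f} C")
```

An estimate of this kind is only a starting point; the temperature used in practice is normally confirmed empirically for the particular probe and buffer.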

The polypeptides of the present invention,
including full-length proteins, fragments thereof, and
fusion proteins, are produced in genetically engineered
host cells according to conventional techniques.
Suitable host cells are those cell types that can be
transformed or transfected with exogenous DNA and grown
in culture, and include bacteria, fungal cells, and
cultured higher eukaryotic cells. Techniques for
manipulating cloned DNA molecules and introducing
exogenous DNA into a variety of host cells are disclosed
by Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd ed., Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, 1989.

In general, a DNA sequence encoding a protein 
of the present invention is operably linked to a 
transcription promoter and terminator within an
expression vector. The vector will commonly contain one
or more selectable markers and one or more origins of
replication, although those skilled in the art will
recognize that within certain systems selectable markers
can be provided on separate vectors, and replication of
the exogenous DNA can be provided by integration into the
host cell genome. Selection of promoters, terminators,
selectable markers, vectors and other elements is a
matter of routine design within the level of ordinary
skill in the art. Many such elements are described in
the literature and are available through commercial
suppliers.

To direct Zsigl3 polypeptides into the 
secretory pathway of a host cell, a secretory signal 
sequence (also known as a leader sequence, prepro 
sequence or pre sequence) is provided in the expression 
vector. The secretory signal sequence is joined to a DNA 
sequence encoding a Zsigl3 polypeptide in the correct 
reading frame. Secretory signal sequences are commonly 
positioned 5' to the DNA sequence encoding the protein of
interest, although certain signal sequences may be
positioned 3' to the DNA sequence of interest (see, e.g.,
Welch et al., U.S. Patent No. 5,037,743; Holland et al.,
U.S. Patent No. 5,143,830). The secretory signal
sequence of Zsigl3 (e.g., the human secretory signal
sequence of SEQ ID NO:1 from nucleotide 105 to nucleotide
161) is generally preferred for use in mammalian cells. 
Signals from host cell genes may be preferred in other 
types of cells (e.g., yeast cells). 
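
The requirement that the secretory signal sequence be joined in the correct reading frame can be checked mechanically. The Python sketch below is illustrative only: it confirms that a segment such as nucleotides 105 to 161 spans a whole number of codons and joins a signal-encoding segment to a downstream coding segment after basic frame checks. The short sequences and function names are hypothetical and do not correspond to any sequence disclosed herein.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_count(first_nt, last_nt):
    """Number of codons in an inclusive nucleotide range; error if not whole."""
    length = last_nt - first_nt + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("segment is not a whole number of codons")
    return length // 3


def fuse_in_frame(signal_dna, cargo_dna):
    """Join a signal-encoding segment 5' to a coding segment, keeping frame."""
    signal = signal_dna.upper()
    cargo = cargo_dna.upper()
    if not signal.startswith("ATG"):
        raise ValueError("signal segment should begin with an initiation codon")
    if len(signal) % 3 != 0 or len(cargo) % 3 != 0:
        raise ValueError("both segments must be whole numbers of codons")
    # A stop codon inside the signal segment would terminate translation
    # before the protein of interest is reached.
    for i in range(0, len(signal), 3):
        if signal[i:i + 3] in STOP_CODONS:
            raise ValueError("in-frame stop codon within the signal segment")
    return signal + cargo


print(codon_count(105, 161))               # codons spanned by nucleotides 105-161
print(fuse_in_frame("atgaaa", "GCTTAA"))   # hypothetical segments
```

Nucleotides 105 through 161 comprise 57 bases, that is, 19 codons.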

Yeast cells, particularly cells of the genus
Saccharomyces, are suitable for use within the present
invention. Methods for transforming yeast cells with
exogenous DNA and producing recombinant proteins
therefrom are disclosed by, for example, Kawasaki, U.S.
Patent No. 4,599,311; Kawasaki et al., U.S. Patent No.
4,931,373; Brake, U.S. Patent No. 4,870,008; Welch et
al., U.S. Patent No. 5,037,743; and Murray et al., U.S.
Patent No. 4,845,075. A preferred vector system for use
in yeast is the POT1 vector system disclosed by Kawasaki
et al. (U.S. Patent No. 4,931,373), which allows
transformed cells to be selected by growth in glucose-
containing media. Transformation systems for other
yeasts, including Hansenula polymorpha,
Schizosaccharomyces pombe, Kluyveromyces lactis,
Kluyveromyces fragilis, Ustilago maydis, Pichia pastoris,
Pichia methanolica and Candida maltosa are known in the
art. See, for example, Gleeson et al., J. Gen.
Microbiol. 3459-3465, 1986; Cregg, U.S. Patent No.
4,882,279; and Hiep et al., Yeast 9:1189-1197, 1993.

The use of Pichia methanolica as host for the
production of recombinant proteins is disclosed in WIPO
Publications WO 97/17450, WO 97/17451, WO 98/02536, and
WO 98/02565; and U.S. Patent No. 5,716,808. DNA
molecules for use in transforming P. methanolica will
commonly be prepared as double-stranded, circular
plasmids, which are preferably linearized prior to
transformation. For polypeptide production in P.
methanolica, it is preferred that the promoter and
terminator in the plasmid be that of a P. methanolica
gene, such as a P. methanolica alcohol utilization gene
(AUG1 or AUG2). Other useful promoters include those of
the dihydroxyacetone synthase (DHAS), formate
dehydrogenase (FMD), and catalase (CAT) genes. To
facilitate integration of the DNA into the host
chromosome, it is preferred to have the entire expression
segment of the plasmid flanked at both ends by host DNA
sequences. A preferred selectable marker for use in
Pichia methanolica is a P. methanolica ADE2 gene, which
encodes phosphoribosyl-5-aminoimidazole carboxylase
(AIRC; EC 4.1.1.21) and which allows ade2 host cells to
grow in the absence of adenine. For large-scale, industrial
processes where it is desirable to minimize the use of
methanol, it is preferred to use host cells in which both
methanol utilization genes (AUG1 and AUG2) are deleted.
For production of secreted proteins, host cells deficient
in vacuolar protease genes (PEP4 and PRB1) are preferred.
Electroporation is used to facilitate the introduction of
a plasmid containing DNA encoding a polypeptide of
interest into P. methanolica cells. It is preferred to
transform P. methanolica cells by electroporation using
an exponentially decaying, pulsed electric field having a
field strength of from 2.5 to 4.5 kV/cm, preferably about
3.75 kV/cm, and a time constant (τ) of from 1 to 40
milliseconds, most preferably about 20 milliseconds.
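
The electroporation parameters given above can be related to instrument settings with simple arithmetic. The Python sketch below is an illustration under stated assumptions: it treats the field strength as the applied voltage divided by the electrode gap and approximates the time constant of an exponentially decaying pulse as the product of circuit resistance and capacitance. The cuvette gap (0.2 cm), resistance (800 ohms), capacitance (25 microfarads), and function names are hypothetical example values, not settings taken from the disclosure.

```python
def pulse_voltage_kv(field_kv_per_cm, gap_cm):
    """Voltage (kV) needed to reach a field strength across a cuvette gap."""
    return field_kv_per_cm * gap_cm


def time_constant_ms(resistance_ohm, capacitance_uf):
    """Approximate time constant (ms) of an exponentially decaying pulse, R*C."""
    # ohm * microfarad = microsecond; divide by 1000 for milliseconds.
    return resistance_ohm * capacitance_uf / 1000.0


def within_disclosed_ranges(field_kv_per_cm, tau_ms):
    """Check settings against the ranges 2.5-4.5 kV/cm and 1-40 ms."""
    return 2.5 <= field_kv_per_cm <= 4.5 and 1.0 <= tau_ms <= 40.0


voltage = pulse_voltage_kv(3.75, 0.2)   # hypothetical 0.2 cm cuvette
tau = time_constant_ms(800, 25)         # hypothetical 800 ohm, 25 uF
print(f"{voltage:.2f} kV, {tau:.0f} ms, in range: {within_disclosed_ranges(3.75, tau)}")
```

In practice the sample itself contributes to the circuit resistance, so the time constant reported by the instrument after the pulse is the value to compare against the stated range.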

Other fungal cells are also suitable as host
cells. For example, Aspergillus cells can be utilized
according to the methods of McKnight et al., U.S. Patent
No. 4,935,349. Methods for transforming Acremonium
chrysogenum are disclosed by Sumino et al., U.S. Patent
No. 5,162,228.

Cultured mammalian cells can also be used as
hosts. Methods for introducing exogenous DNA into
mammalian host cells include calcium phosphate-mediated
transfection (Wigler et al., Cell 14:725, 1978; Corsaro
and Pearson, Somatic Cell Genetics 7:603, 1981; Graham
and Van der Eb, Virology 52:456, 1973), electroporation
(Neumann et al., EMBO J. 1:841-845, 1982) and DEAE-
dextran mediated transfection (Ausubel et al., eds.,
Current Protocols in Molecular Biology, John Wiley and
Sons, Inc., NY, 1987). The production of recombinant
proteins in cultured mammalian cells is disclosed by, for
example, Levinson et al., U.S. Patent No. 4,713,339;
Hagen et al., U.S. Patent No. 4,784,950; Palmiter et al.,
U.S. Patent No. 4,579,821; and Ringold, U.S. Patent No.
4,656,134. Preferred cultured mammalian cells include
the COS-1 (ATCC No. CRL 1650), COS-7 (ATCC No. CRL 1651),
BHK (ATCC No. CRL 1632), BHK 570 (ATCC No. CRL 10314) and
293 (ATCC No. CRL 1573; Graham et al., J. Gen. Virol.
36:59-72, 1977) cell lines. Additional suitable cell
lines are known in the art and available from public
depositories such as the American Type Culture
Collection, Rockville, Maryland.

Other higher eukaryotic cells can also be used
as hosts, including insect cells, plant cells and avian
cells. Transformation of insect cells and production of
foreign proteins therein is disclosed by Guarino et al.,
U.S. Patent No. 5,162,222 and Bang et al., U.S. Patent
No. 4,775,624. The use of Agrobacterium rhizogenes as a
vector for expressing genes in plant cells has been
reviewed by Sinkar et al., J. Biosci. (Bangalore) 11:47-
58, 1987.

Prokaryotic host cells for use in carrying out
the present invention include strains of the bacteria
Escherichia coli; Bacillus and other genera are also
useful. Techniques for transforming these hosts and
expressing foreign DNA sequences cloned therein are well
known in the art (see, e.g., Sambrook et al., ibid.).
When expressing a Zsigl3 protein in bacteria such as E.
coli, the protein may be retained in the cytoplasm,
typically as insoluble granules, or may be directed to
the periplasmic space by a bacterial secretion sequence.
In the former case, the cells are lysed, and the granules
are recovered and denatured using, for example, guanidine
isothiocyanate or urea. The denatured protein can then
be refolded and dimerized by diluting the
denaturant, such as by dialysis against a solution of
urea and a combination of reduced and oxidized
glutathione, followed by dialysis against a buffered
saline solution. In the latter case, the protein can be
recovered from the periplasmic space in a soluble and
functional form by disrupting the cells (by, for example,
sonication or osmotic shock) to release the contents of
the periplasmic space and recovering the protein, thereby
obviating the need for denaturation and refolding.




The secretory peptide of Zsigl3 (residues -19 
through -1 of SEQ ID NO: 2) can be used to direct the 
secretion of other proteins of interest from a host cell. 
Such use is within the level of ordinary skill in the 
art. Briefly, a DNA segment encoding the Zsigl3
secretory peptide is operably linked to a second DNA
segment encoding a protein of interest within a host cell 
and the cell is cultured according to conventional 
methods as summarized below. The protein of interest is 
then recovered from the culture media. 

Transformed or transfected host cells are 
cultured according to conventional procedures in a 
culture medium containing nutrients and other components 
required for the growth of the chosen host cells. A 
variety of suitable media, including defined media and 
complex media, are known in the art and generally include 
a carbon source, a nitrogen source, essential amino 
acids, vitamins and minerals. Media may also contain 
such components as growth factors or serum, as required. 
The growth medium will generally select for cells 
containing the exogenously added DNA by, for example, 
drug selection or deficiency in an essential nutrient 
which is complemented by the selectable marker carried on 
the expression vector or co-transfected into the host
cell. P. methanolica cells are cultured in a medium
comprising adequate sources of carbon, nitrogen and trace
nutrients at a temperature of about 25°C to 35°C. Liquid
cultures are provided with sufficient aeration by
conventional means, such as shaking of small flasks or
sparging of fermentors. A preferred culture medium for
P. methanolica is YEPD.

Recombinant Zsigl3 polypeptides (including
chimeric polypeptides) can be purified from cells or cell
culture media using conventional fractionation and
purification methods and media. Ammonium sulfate
precipitation and acid or chaotrope extraction may be
used for fractionation of samples. Exemplary
purification steps include hydroxyapatite, size
exclusion, FPLC and reverse-phase high performance liquid
chromatography. Suitable anion exchange media include
derivatized dextrans, agarose, cellulose, polyacrylamide,
specialty silicas, and the like. Exemplary
chromatographic media include those media derivatized
with phenyl, butyl, or octyl groups, such as Phenyl-
Sepharose FF (Pharmacia), Toyopearl butyl 650 (Toso Haas,
Montgomeryville, PA), Octyl-Sepharose (Pharmacia) and the
like; or polyacrylic resins, such as Amberchrom CG 71
(Toso Haas) and the like. Suitable solid supports
include glass beads, silica-based resins, cellulosic
resins, agarose beads, cross-linked agarose beads,
polystyrene beads, cross-linked polyacrylamide resins and
the like that are insoluble under the conditions in which
they are to be used. These supports can be modified with
reactive groups that allow attachment of proteins by
amino groups, carboxyl groups, sulfhydryl groups,
hydroxyl groups and/or carbohydrate moieties. Examples
of coupling chemistries include cyanogen bromide
activation, N-hydroxysuccinimide activation, epoxide
activation, sulfhydryl activation, hydrazide activation,
and carboxyl and amino derivatives for carbodiimide
coupling chemistries. These and other solid media are
well known and widely used in the art, and are available
from commercial suppliers. Selection of a particular
method is a matter of routine design and is determined in
part by the properties of the chosen support. See, for
example, Affinity Chromatography: Principles & Methods,
Pharmacia LKB Biotechnology, Uppsala, Sweden, 1988.
Activated serine proteases are preferably purified by
binding to immobilized p-aminobenzamidine (e.g.,
Benzamidine-Sepharose®; Pharmacia) with subsequent
elution using soluble benzamidine (Winkler et al.,
Bio/Technology 3:990, 1985; Mizuno et al., Biochem.
Biophys. Res. Comm. 144:807, 1987).

Proteins comprising affinity tags or other
binding domains can be purified by exploiting the
properties of the additional domain. For example,
immobilized metal ion adsorption chromatography (IMAC)
can be used to purify histidine-rich proteins, including
proteins comprising poly-histidine tags. Briefly, a gel
is first charged with divalent metal ions to form a
chelate (Sulkowski, Trends in Biochem. 3:1-7, 1985).
Histidine-rich proteins will be adsorbed to this matrix
with differing affinities, depending upon the metal ion
used, and will be eluted by competitive elution, lowering
the pH, or use of strong chelating agents. Other methods
of purification include purification of glycosylated
proteins by lectin affinity chromatography and ion
exchange chromatography ("Guide to Protein Purification",
Methods Enzymol., Vol. 182, M. Deutscher, (ed.), Academic
Press, San Diego, 1990, pp. 529-39).

Zsigl3 polypeptides can also be prepared
through chemical synthesis. The polypeptides may be
glycosylated or non-glycosylated; pegylated or non-
pegylated; and may or may not include an initial
methionine amino acid residue.

When proteins are produced intracellularly
(such as in prokaryotic host cells) or by in vitro
synthesis, protein refolding (and optionally reoxidation)
procedures as generally disclosed above are
advantageously used.

It is preferred to purify Zsigl3 proteins to
>80% purity, more preferably to >90% purity, even more
preferably >95%, and particularly preferred is a
pharmaceutically pure state, that is greater than 99.9%
pure with respect to contaminating macromolecules,
particularly other proteins and nucleic acids, and free
of infectious and pyrogenic agents. Preferably, a
purified protein is substantially free of other proteins,
particularly other proteins of animal origin.

Proteins of the present invention can be used
within laboratory and industrial settings to cleave
proteins for a variety of purposes that will be evident
to those skilled in the art. The proteins can be used
alone to provide specific proteolysis or can be combined
with other proteases to provide a "cocktail" with a broad
spectrum of activity. Representative laboratory uses
include the removal of proteins from biological samples,
such as preparations of nucleic acids; and for digesting
proteins in conjunction with peptide mapping and
sequencing. Within industry, the proteins of the present
invention can be formulated in laundry detergents to aid
in the removal of protein stains, and can be used within
the large scale preparation of recombinant proteins to
specifically cleave fusion proteins, including removing
affinity tags. The proteins of the present invention can
be added to a variety of compositions and solutions as
proteolytically active enzymes or as protease precursors.
In the latter arrangement, the protein is subsequently
activated, such as by the addition of an activating
protease.

The proteins of the present invention are also 
useful as research reagents to identify novel protease
inhibitors. Briefly, test samples (compounds, broths,
extracts, and the like) are added to protease assays as
disclosed above to determine their ability to inhibit
substrate cleavage. Inhibitors identified in this way
can be used in industry and research to reduce or prevent
undesired proteolysis. As with proteases, inhibitors can 
be combined to increase the spectrum of activity. 

Zsigl3 proteins and protein fragments can also
be used to prepare antibodies that specifically bind to
Zsigl3 proteins. As used herein, the term "antibodies"
includes polyclonal antibodies, monoclonal antibodies,
antigen-binding fragments thereof such as F(ab')2 and Fab
fragments, single chain antibodies, and the like,
including genetically engineered antibodies. Non-human
antibodies can be humanized by grafting non-human CDRs
onto human framework and constant regions, or by
incorporating the entire non-human variable domains
(optionally "cloaking" them with a human-like surface by
replacement of exposed residues, wherein the result is a
"veneered" antibody). In some instances, humanized
antibodies may retain non-human residues within the human
variable region framework domains to enhance proper
binding characteristics. Through humanizing antibodies,
biological half-life can be increased, and the potential
for adverse immune reactions upon administration to
humans is reduced. One skilled in the art can generate
humanized antibodies with specific and different constant
domains (i.e., different Ig subclasses) to facilitate or
inhibit various immune functions associated with
particular antibody constant domains. Alternative
techniques for generating or selecting antibodies useful
herein include in vitro exposure of lymphocytes to Zsigl3
protein, and selection of antibody display libraries in
phage or similar vectors (for instance, through use of
immobilized or labeled Zsigl3 protein). Antibodies are
defined to be specifically binding if they bind to a
Zsigl3 protein with an affinity at least 10-fold greater
than the binding affinity to control (non-Zsigl3)
protein. The affinity of a monoclonal antibody can be
readily determined by one of ordinary skill in the art
(see, for example, Scatchard, Ann. NY Acad. Sci. 51:660-
672, 1949).

Methods for preparing polyclonal and monoclonal
antibodies are well known in the art (see, for example,
Hurrell, J. G. R., Ed., Monoclonal Hybridoma Antibodies:
Techniques and Applications, CRC Press, Inc., Boca Raton,
FL, 1982). As would be evident to one of ordinary skill
in the art, polyclonal antibodies can be generated from a
variety of warm-blooded animals such as horses, cows,
goats, sheep, dogs, chickens, rabbits, mice, and rats.
The immunogenicity of a Zsigl3 polypeptide can be
increased through the use of an adjuvant such as alum
(aluminum hydroxide) or Freund's complete or incomplete
adjuvant. Polypeptides useful for immunization also
include fusion polypeptides, such as fusions of a Zsigl3
protein or a portion thereof with an immunoglobulin
polypeptide or with maltose binding protein. The
polypeptide immunogen may be a full-length molecule or a
portion thereof. If the polypeptide portion is "hapten-
like", such portion may be advantageously joined or
linked to a macromolecular carrier (such as keyhole
limpet hemocyanin (KLH), bovine serum albumin (BSA) or
tetanus toxoid) for immunization.

A variety of assays known to those skilled in
the art can be utilized to detect antibodies which
specifically bind to Zsigl3 proteins. Exemplary assays
are described in detail in Antibodies: A Laboratory
Manual, Harlow and Lane (Eds.), Cold Spring Harbor
Laboratory Press, 1988. Representative examples of such
assays include: concurrent immunoelectrophoresis, radio-
immunoassays, radio-immunoprecipitations, enzyme-linked
immunosorbent assays (ELISA), dot blot assays, Western
blot assays, inhibition or competition assays, and
sandwich assays.

Antibodies to Zsigl3 proteins can be used for
affinity purification of the protein; within diagnostic
assays for determining circulating levels of the protein;
for detecting or quantitating soluble Zsigl3 protein or
protein fragments as a marker of underlying pathology or
disease; for immunolocalization within whole animals or
tissue sections, including immunodiagnostic applications;
for immunohistochemistry; and as antagonists to block
protein activity in vitro and in vivo. Antibodies to
Zsigl3 can also be used for tagging cells that express
Zsigl3; for affinity purification of Zsigl3 proteins; in
analytical methods employing FACS; for screening
expression libraries; and for generating anti-idiotypic
antibodies. For certain applications, including in vitro
and in vivo diagnostic uses, it is advantageous to employ
labeled antibodies. Suitable direct tags or labels
include radionuclides, enzymes, substrates, cofactors,
inhibitors, fluorescent markers, chemiluminescent
markers, magnetic particles and the like; indirect tags
or labels may feature use of biotin-avidin or other
complement/anti-complement pairs as intermediates.
Antibodies of the present invention can also be directly
or indirectly conjugated to drugs, toxins, radionuclides
and the like, and these conjugates used for in vivo
diagnostic or therapeutic applications.

While not wishing to be bound by theory, tissue
distribution of Zsigl3 mRNA suggests that the protein may
play a defensive role. Proteases that serve antibiotic
or antitoxin functions are known (Gabay, ibid.; Scocchi
et al., ibid.). Proteins of the present invention may
thus be useful as antibiotics and/or antitoxins. They
may further be used as diagnostic indicators of infection
by assaying body fluids for the presence of Zsigl3.
Zsigl3 proteins or fragments thereof can be detected
using, for example, immunoassay techniques employing
antibodies specific for Zsigl3 epitopes. Assays can be
performed using soluble or immobilized antibodies in a
variety of known formats.

A Zsigl3 gene, a probe comprising Zsigl3 DNA or
RNA, or a subsequence thereof can be used to determine if
the Zsigl3 gene is present on chromosome 11 or if a
mutation has occurred. Detectable chromosomal
aberrations at the Zsigl3 gene locus include, but are not
limited to, aneuploidy, gene copy number changes,
insertions, deletions, restriction site changes and
rearrangements. These aberrations can occur within the
coding sequence, within introns, or within flanking
sequences, including upstream promoter and regulatory
regions, and may be manifested as physical alterations
within a coding sequence or changes in gene expression
level. Analytical probes will generally be at least 20
nucleotides in length, although somewhat shorter probes
(14-17 nucleotides) can be used. PCR primers are at
least 5 nucleotides in length, preferably 15 or more nt,
more preferably 20-30 nt. Short polynucleotides can be
used when a small region of the gene is targeted for
analysis. For gross analysis of genes, a polynucleotide
probe may comprise an entire exon or more. Probes will
generally comprise a polynucleotide linked to a
signal-generating moiety such as a radionucleotide. In general,
gene-based diagnostic methods comprise the steps of (a)
obtaining a genetic sample from a patient; (b) incubating
the genetic sample with a polynucleotide probe or primer
as disclosed above, under conditions wherein the
polynucleotide will hybridize to complementary
polynucleotide sequence, to produce a first reaction
product; and (c) comparing the first reaction product
to a control reaction product. A difference between the
first reaction product and the control reaction product
is indicative of a genetic abnormality in the patient.
Genetic samples for use within the present invention
include genomic DNA, cDNA, and RNA. The polynucleotide
probe or primer can be RNA or DNA, and will comprise a
portion of SEQ ID NO:1, SEQ ID NO:14, or SEQ ID NO:17;
the complement of SEQ ID NO:1, SEQ ID NO:14, or SEQ ID
NO:17; or an RNA equivalent thereof. Suitable assay
methods in this regard include molecular genetic
techniques known to those in the art, such as restriction
fragment length polymorphism (RFLP) analysis, short
tandem repeat (STR) analysis employing PCR techniques,
ligation chain reaction (Barany, PCR Methods and
Applications 1:5-16, 1991), ribonuclease protection
assays, and other genetic linkage analysis techniques
known in the art (Sambrook et al., ibid.; Ausubel et
al., ibid.; A.J. Marian, Chest 108:255-65, 1995).

Ribonuclease protection assays (see, e.g., Ausubel et
al., ibid., ch. 4) comprise the hybridization of an RNA
probe to a patient RNA sample, after which the reaction
product (RNA-RNA hybrid) is exposed to RNase. Hybridized
regions of the RNA are protected from digestion. Within
PCR assays, a patient genetic sample is incubated with a
pair of polynucleotide primers, and the region between
the primers is amplified and recovered. Changes in size
or amount of recovered product are indicative of
mutations in the patient. Another PCR-based technique
that can be employed is single strand conformational
polymorphism (SSCP) analysis (Hayashi, PCR Methods and
Applications 1:34-38, 1991).
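
The comparison step of the PCR assay described above, in which a
change in size or amount of recovered product relative to a control
is taken as indicative of a mutation, can be sketched in code. The
following Python fragment is illustrative only. The record layout,
the function name, and both tolerance values are assumptions
introduced here; none of them comes from the specification.

```python
from dataclasses import dataclass


@dataclass
class PcrProduct:
    """One recovered amplification product (hypothetical record layout)."""
    size_bp: int     # estimated product size, in base pairs
    amount: float    # relative amount, e.g. a normalized band intensity


def differs_from_control(patient: PcrProduct,
                         control: PcrProduct,
                         size_tol_bp: int = 5,
                         amount_tol: float = 0.25) -> bool:
    """Return True if the patient product differs from the control product.

    A difference in size or in amount is treated as a possible genetic
    abnormality. Both tolerances are ASSUMED values chosen only for
    illustration; a real assay would set them from its own validation data.
    """
    size_changed = abs(patient.size_bp - control.size_bp) > size_tol_bp
    # Amounts are compared as a fractional deviation from the control amount.
    amount_changed = (abs(patient.amount - control.amount)
                      > amount_tol * control.amount)
    return size_changed or amount_changed


control = PcrProduct(size_bp=300, amount=1.0)
same = PcrProduct(size_bp=300, amount=1.05)      # within both tolerances
shorter = PcrProduct(size_bp=262, amount=1.0)    # 38 bp shorter than control

print(differs_from_control(same, control))       # False
print(differs_from_control(shorter, control))    # True
```

A flagged difference of this kind would only prompt further analysis;
it does not by itself identify the nature of the underlying change.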

Radiation hybrid mapping is a somatic cell
genetic technique developed for constructing
high-resolution, contiguous maps of mammalian chromosomes (Cox
et al., Science 250:245-250, 1990). Partial or full
knowledge of a gene's sequence allows one to design PCR
primers suitable for use with chromosomal radiation
hybrid mapping panels. Radiation hybrid mapping panels
that cover the entire human genome, such as the
Stanford G3 RH Panel and the GeneBridge 4 RH
Panel (Research Genetics, Inc., Huntsville, AL), are
commercially available. These panels enable rapid, PCR-based
chromosomal localizations and ordering of genes,
sequence-tagged sites (STSs), and other nonpolymorphic
and polymorphic markers within a region of interest.
This technique allows one to establish directly
proportional physical distances between newly discovered
genes of interest and previously mapped markers. The
precise knowledge of a gene's position can be useful for
a number of purposes, including: 1) determining
relationships between short sequences and obtaining
additional surrounding genetic sequences in various
forms, such as YACs, BACs or cDNA clones; 2) providing a
possible candidate gene for an inheritable disease which
shows linkage to the same chromosomal region; and 3)
cross-referencing model organisms, such as mouse, which
may aid in determining what function a particular gene
might have.
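
As a rough illustration of how PCR scores on a radiation hybrid panel
translate into a distance, the sketch below applies a two-point
estimator commonly given in the radiation hybrid mapping literature:
the breakage frequency is theta = D / (T (R_A + R_B - 2 R_A R_B)),
where D is the number of hybrids retaining only one of the two
markers, T is the number of hybrids scored for both, and R_A and R_B
are the marker retention frequencies; the distance in centiRays is
-100 ln(1 - theta). The score encoding (1, 0, None), the function
name, and the example vectors are assumptions made here for
illustration, and the sketch is not claimed to reproduce the
calculation performed by any particular mapping server.

```python
import math
from typing import Optional, Sequence

# ASSUMED encoding: 1 = marker retained (PCR positive),
# 0 = not retained (PCR negative), None = hybrid not scored.
Score = Optional[int]


def two_point_distance_cr(a: Sequence[Score], b: Sequence[Score]) -> float:
    """Estimate the distance between two markers in centiRays."""
    # Use only hybrids that were scored for both markers.
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    total = len(pairs)
    if total == 0:
        raise ValueError("no hybrids scored for both markers")

    ret_a = sum(x for x, _ in pairs) / total      # retention frequency of A
    ret_b = sum(y for _, y in pairs) / total      # retention frequency of B
    discordant = sum(1 for x, y in pairs if x != y)

    denom = total * (ret_a + ret_b - 2 * ret_a * ret_b)
    if denom == 0:
        raise ValueError("retention frequencies carry no linkage information")

    theta = discordant / denom                    # breakage frequency
    if theta >= 1:
        return math.inf                           # markers appear unlinked
    return -100.0 * math.log1p(-theta)            # -100 * ln(1 - theta)


# Hypothetical scores for two markers across ten hybrids.
marker_a = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
marker_b = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
print(round(two_point_distance_cr(marker_a, marker_b), 1))
```

Because the estimate depends on the radiation dose used to build the
panel, distances from different panels are not directly comparable.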

The invention is further illustrated by the
following non-limiting examples.

Example 1 

Tissue distribution of Zsigl3 mRNA was analyzed
using Human Multiple Tissue Northern Blots (obtained from
Clontech, Inc., Palo Alto, CA). A 40-bp DNA probe (ZC
11,667; SEQ ID NO:11) was radioactively labeled with 32P
using T4 polynucleotide kinase and forward reaction
buffer (GIBCO BRL, Gaithersburg, MD) according to the
supplier's specifications. The probe was purified using
a push column (Nuctrap™ column; Stratagene Cloning
Systems, La Jolla, CA). Prehybridization and
hybridization were carried out in a commercially
available solution (ExpressHyb™ hybridization solution;
Clontech Laboratories, Inc., Palo Alto, CA). Blots were
hybridized overnight at 42°C, washed in 2X SSC, 0.05% SDS
at room temperature, then in 1X SSC, 0.1% SDS at 60°C.
Two transcripts were observed: a strongly hybridizing
~1.8 kb band and a fainter band at approximately 4.0 kb.

An RNA Master Dot Blot (Clontech Laboratories)
that contained RNAs from various tissues that were
normalized to eight housekeeping genes was also probed
with the 40-bp oligonucleotide probe (SEQ ID NO:11). The
blot was prehybridized, then hybridized overnight with 10^6
cpm/ml of probe at 42°C according to the manufacturer's
specifications. The blot was washed with 2X SSC, 0.05%
SDS at room temperature, then in 1X SSC, 0.1% SDS at
60°C. After a four-day exposure, signals were seen in
trachea, aorta, bladder, and fetal kidney.

Example 2 

Zsigl3 was mapped to chromosome 11 using the
commercially available GeneBridge 4 Radiation Hybrid
Panel (Research Genetics, Inc., Huntsville, AL). The
GeneBridge 4 Radiation Hybrid Panel contains PCRable DNAs
from each of 93 radiation hybrid clones, plus two control
DNAs (the HFL donor and the A23 recipient). A publicly
available WWW server
(http://www-genome.wi.mit.edu/cgi-bin/contig/rhmapper.pl)
allows mapping relative to the
Whitehead Institute/MIT Center for Genome Research
(WICGR) radiation hybrid map of the human genome, which
was constructed with the GeneBridge 4 Radiation Hybrid
Panel.

For the mapping of Zsigl3, 20 µl reaction
mixtures were set up in a PCRable 96-well microtiter
plate (Stratagene Cloning Systems, La Jolla, CA) and
incubated in a thermal cycler (RoboCycler™ Gradient 96;
Stratagene Cloning Systems). Each of the 95 PCR
reactions consisted of 2 µl 10X KlenTaq PCR reaction
buffer (Clontech Laboratories, Inc.), 1.6 µl dNTPs mix
(2.5 mM each, Perkin-Elmer, Foster City, CA), 1 µl sense
primer (ZC 13,508; SEQ ID NO:12), 1 µl antisense primer
(ZC 13,509; SEQ ID NO:13), 2 µl of a commercially
available density increasing agent and tracking dye
(RediLoad; Research Genetics, Inc., Huntsville, AL), 0.4
µl of polymerase/antibody mixture (50X Advantage™ KlenTaq
Polymerase Mix; Clontech Laboratories, Inc.), 25 ng of
DNA from an individual hybrid clone or control and ddH2O
for a total volume of 20 µl. The reaction mixtures were
overlaid with an equal amount of mineral oil and sealed.
The PCR cycler conditions were as follows: an initial 5
minute denaturation at 95°C; 35 cycles of a 1 minute
denaturation at 95°C, 1 minute annealing at 62°C and 1.5
minute extension at 72°C; followed by a final extension of
7 minutes at 72°C. The reaction products were separated
by electrophoresis on a 3% NuSieve® GTG agarose gel (FMC
Bioproducts, Rockland, ME).
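
The reaction composition and cycling program above lend themselves to
a short arithmetic check. The Python sketch below sums the listed
reagent volumes, derives the water volume per reaction, scales a
master mix to the 95 reactions, and totals the programmed hold times.
The template volume (the text states only 25 ng of DNA, not a volume)
and the pipetting overage are assumptions introduced here and are
marked as such in the code; ramp times between temperatures are not
included.

```python
# Per-reaction volumes in microliters, as listed in the text.
REAGENTS_UL = {
    "10X KlenTaq PCR reaction buffer": 2.0,
    "dNTPs mix (2.5 mM each)": 1.6,
    "sense primer": 1.0,
    "antisense primer": 1.0,
    "RediLoad": 2.0,
    "50X polymerase/antibody mix": 0.4,
}
TOTAL_UL = 20.0        # final reaction volume stated in the text
N_REACTIONS = 95       # 93 hybrid clones plus two control DNAs

DNA_UL = 5.0           # ASSUMPTION: volume that delivers 25 ng of template
OVERAGE = 0.10         # ASSUMPTION: 10% extra master mix for pipetting loss

reagent_sum = sum(REAGENTS_UL.values())          # about 8.0 ul
water_ul = TOTAL_UL - reagent_sum - DNA_UL       # water to reach 20 ul

# Master mix holds everything except the template, which differs per well.
scale = N_REACTIONS * (1 + OVERAGE)
master_mix = {name: round(vol * scale, 1) for name, vol in REAGENTS_UL.items()}
master_mix["ddH2O"] = round(water_ul * scale, 1)

# Programmed hold times in minutes (ramp times are not included).
cycle_min = 1.0 + 1.0 + 1.5                      # denature + anneal + extend
total_min = 5.0 + 35 * cycle_min + 7.0

print(f"reagents per reaction: {reagent_sum:.1f} ul")
print(f"water per reaction:    {water_ul:.1f} ul")
for name, vol in master_mix.items():
    print(f"  {name}: {vol} ul")
print(f"programmed time: {total_min} min")
```

The listed reagents account for 8.0 µl of each 20 µl reaction, leaving
12.0 µl to be divided between template and water, and the programmed
hold times total 134.5 minutes.
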

The results showed that Zsigl3 maps 417.10
cR_3000 distal from the top of the human chromosome 11
linkage group on the WICGR radiation hybrid map.
Proximal and distal framework markers were D11S1979 and
D11S2384, respectively. The use of surrounding markers
positions Zsigl3 in the 11q22.1 region on the integrated
LDB chromosome 11 map (The Genetic Location Database,
University of Southampton, WWW server:
http://cedar.genetics.soton.ac.uk/public_html/). This
region of chromosome 11 is fairly rich in proteases.

From the foregoing, it will be appreciated
that, although specific embodiments of the invention have
been described herein for purposes of illustration,
various modifications may be made without deviating from
the spirit and scope of the invention. Accordingly, the
invention is not limited except as by the appended
claims.




